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Abstract 

In the last years, Google's PageRank optimization problems have been extensively studied. In 
that case, the ranking is given by the invariant measure of a stochastic matrix. In this paper, we 
consider the more general situation in which the ranking is determined by the Perron eigenvector 
of a nonnegative, but not necessarily stochastic, matrix, in order to cover Kleinberg's HITS 
algorithm. We also give some results for Tomlin's HOTS algorithm. The problem consists then 
in finding an optimal outlink strategy subject to design constraints and for a given search engine. 

We study the relaxed versions of these problems, which means that we should accept weighted 
hyperlinks. We provide an efficient algorithm for the computation of the matrix of partial deriva- 
tives of the criterion, that uses the low rank property of this matrix. We give a scalable algorithm 
that couples gradient and power iterations and gives a local minimum of the Perron vector opti- 
mization problem. We prove convergence by considering it as an approximate gradient method. 

We then show that optimal linkage stategies of HITS and HOTS optimization problems verify 
a threshold property. We report numerical results on fragments of the real web graph for these 
search engine optimization problems. 



1. Introduction 

Motivation. Internet search engines use a variety of algorithms to sort web pages based on their 
text content or on the hyperlink structure of the web. In this paper, we focus on algorithms 
that use the latter hyperlink structure, called link-based algorithms. The basic notion for all 
these algorithms is the web graph, which is a digraph with a node for each web page and an 
arc between pages / and j if there is a hyperlink from page / to page j. Famous link-based 
algorithms a re Page Rank llBP98ll . HITS IIKle99L SALSA IILMOOl and HOTS IITom031 . See 
also |LM05' 'LM061 for surveys on these algorithms. The main problem of this paper is the 
optimization of the ranking of a given web site. It consists in finding an optimal outlink strategy 
maximizing a given ranking subject to design constraints. 

One of the main ranking methods relies on the PageRank introduced by Brin and Page IIBP98I . 
It is defined as the invariant measure of a walk made by a random surfer on the web graph. When 
reading a given page, the surfer either selects a link from the current page (with a uniform prob- 
ability), and moves to the page pointed by that link, or interrupts his current search, and then 
moves to an arbitrary page, which is selected according to given "zapping" probabilities. The 
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rank of a page is defined as its frequency of visit by the random surfer It is interpreted as the 
"popularity" of the page. The PageRank optimization problem has been studied in several works: 
llAEoel lMV06l IdKNvDOSI |lT09l ICJB 1 Ol IFABG 1 01 . The last two papers showed that PageRank 
optimization problems have a Markov decision process structure and both papers provided ef- 
ficient algorithm that converge to a global optimum. Csaji, lungers and Blondel in I CJBlOl 
showed that optimizing the PageRank score of a single web page is a polynomial problem. Fer- 
coq, Akian, Bouhtou and Gaubert in [FABGlOl gave an alternative Markov decision process 
model and an efficient algorithm for the PageRank optimization problem with linear utility func- 
tions and more general design constraints, showing in particular that any linear function of the 
PageRank vector can be optimized in polynomial time. 

In this paper, we consider the more general situation in which the ranking is determined 
by the Perron eigenvector of a nonnegative, but not necessarily stochastic, matrix. The Perron- 
Frobenius theorem (see [BP941 for instance) states that any nonnegative matrix A has a nonneg- 
ative principal eigenvalue called the Perron root and nonnegative principal eigenvectors. If, in 
addition, A is irreducible, then the Perron root is simple and the (unique up to a multiplicative 
constant) nonnegative eigenvector, called the Perron vector, has only positive entries. This prop- 
erty makes it a good candidate to sort web pages. The ranking algorithms considered differ in 
the way of constructing from the web graph a nonnegative irreducible matrix from which we 
determine the Perron vector Then, the bigger is the Perron vector's coordinate corresponding to 
a web page, the higher this web page is in the ranking. In |„Kee93] , such a ranking is proposed 
for football teams. The paper IISaa87l uses the Perron vector to rank teachers from pairwise com- 
parisons. See also | Vig091 for a survey on the subject. When it comes to web page ranking, the 
PageRank is the Perron eigenvector of the transition matrix described above but the HITS and 
SALSA algorithms also rank pages according to a Perron vector 

The HITS algorithm f Kle99l is not purely a link-based algorithm. It is composed of two steps 
and the output depends on the query of the user. Given a query, we first select a seed of pages that 
are relevant to the query according to their text content. This seed is then extended with pages 
linking to them, pages to which they link and all the hyperlinks between the pages selected. We 
thus obtain a subgraph of the web graph focused on the query. Then, the second step assigns each 
page two scores: a hub score v and an authority score u such that good hubs should point to good 
authorities and good authorities should be pointed to by good hubs. Introducing the adjacency 
matrix A of the focused graph, this can be written as v = pAu and u = pA^v with p € M+, 
which means that the vector of hub scores is the Perron eigenvector of the matrices A^A and 
that the vector of authority scores is the Perron eigenvector of AA^ . The construction of HITS' 
focused subgraph is a combination of text content relevancy with the query and of hyperlink 
considerations. Maximizing the probability of appearance of a web page on this subgraph is thus 
a composite problem out of the range of this paper. We shall however study the optimization of 
HITS authority, for a given focused subgraph. 

The SALSA algorithm ILMOOII shares the same first step as HITS. The second step consists in 
the computation of the invariant measure of a stochastic matrix which consists in a normalization 
of the rows of A^A. In fact, with natural assumptions, this measure is proportionnal to the 
indegree of the web page. The authors show that the interest of the ranking algorithm lies in the 
combination of the two steps and not in one or the other alone. Thus from a hyperlink point of 
view, optimizing the rank in SALSA simply consists in maximizing the number of hyperlinks 
pointing to the target page. We shall not study SALSA optimization any further. 

We also studied the optimization of Tomlin's HOTS scores fTom031. In this case, the ranking 
is the vector of dual variables of an optimal flow problem. The flow represents an optimal 
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distribution of web surfers on the web graph in the sense of entropy minimization. The dual 
variable, one by page, is interpreted as the "temperature" of the page, the hotter a page the 
better. TomUn showed that this vector is solution of a nonlinear fix point equation: it may 
be seen as a nonlinear eigenvector. Indeed, we show that most of the arguments available in 
the case of Perron vector optimization can be adapted to HOTS optimization. We think that 
this supports Tomlin's remark that "malicious manipulation of the dual values of a large scale 
nonlinear network optimization model [. . . ] would be an interesting topic". 

Contribution. In this paper, we study the problem of optimizing the Perron eigenvector of a con- 
trolled matrix and apply it to PageRank, HITS and HOTS optimization. Our first main result is 
the development of a scalable algorithm for the local optimization of a scalar function of the Per- 
ron eigenvector over a set of nonnegative irreducible matrices. Indeed, the global Perron vector 
optimization over a convex set of nonnegative matrices is NP-hard, so we focus on the searching 
of local optima. We give in Theorem[T]a power-type algorithm for the computation of the matrix 
of the partial derivatives of the objective, based on the fact that it is a rank 1 matrix. This theorem 
shows that computing the partial derivatives of the objective has the same computational cost as 
computing the PeiTon vector by the power method, which is the usual method when dealing with 
the large and sparse matrices built from the web graph. Then we give an optimization algorithm 
that couples power and gradient iterations (Algorithms [2] and [3]l. Each step of the optimization 
algorithm involves a suitable number of power iterations and a descent step. By considering this 
algorithm to be an approximate projected gradient algorithm ||Pol97l |PP02| . we prove that the 
algorithm converges to a stationary point (Theorem|2]i. Compared with the case when the num- 
ber of power iterations is not adapted dynamically, we got a speedup between 3 and 20 in our 
numerical experiments (Section [7]i together with a more precise convergence result. 

Our second main result is the application of Perron vector optimization to the optimization 
of scalar functions of HITS authority or HOTS scores. We derive optimization algorithms and, 
thanks to the low rank of the matrix of partial derivatives, we show that the optimal linkage 
strategies of both problems satisfy a threshold property (Propositions |9] and 12 1. This property 



was akeady proved for PageRank optimization in |dKNvD08 FABG10|. As in |IT09 CJBIO] 
IFABGlOl we partition the set of potential links (;, j) into three subsets, consisting respectively 
of the set of obligatory links, the set of prohibited links and the set of facultative links. When 
choosing a subset of the facultative links, we get a graph from which we get any of the three 
ranking vectors. We are then looking for the subset of facultative links that maximizes a given 
utility function. We also study the associated relaxed problems, where we accept weighted adja- 
cency matrices. This assumes that the webmaster can influence the importance of the hyperlinks 
of the pages she controls, for instance by choosing the size of the font, the color or the position 
of the link within the page. In fact, we shall solve the relaxed problems and then give conditions 
or heuristics to get an admissible strategy for the discrete problems. 

Related works. As explained in the first part of the introduction, this paper extends the study 
of PageRank optimization developped in ||AL06l |MV06l IdKNvDOSI IITU91 ICJB 1 Ol IFABGl 01 to 
HITS authority |Kle99| and HOTS [lbni03 | optimization. 

We based our study of Perron eigenvector optimization on two other domains: eigenvalue 
optimization and eigenvector sensitivity. There is a vast literature on eigenvalue and eigenvector 
sensitivity with many domains of application (see the survey rHA89 | for instance). These works 
cope with perturbations of a given system. They consider general square matrices and any eigen- 
value or eigenvector. They give the directional derivatives of the eigenvalue and eigenvector of a 
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matrix with respect to a given perturbation of this matrix ||Nel76l|MS88l . Perron eigenvalue and 
eigenvector sensitivity was developped in IIDN84||DN85I . 

This led to the development of eigenvalue optimization. In IICDW75I IOve9 1 1 ISF95II the au- 
thors show that the minimization of a convex function of the eigenvalues of symmetric matrices 
subject to linear constraints is a convex problem and can be solved with semi-definite program- 
ming. Eigenvalue optimization of nonsymmetric matrices is a more difficult problem. In general, 
the eigenvalue is a nonconvex nonlipschitz function of the entries of the matrix. The last section 
of iL096i proposes a method to reduce the nonsymmetric case to the symmetric case by adding 
(many) additional variables. Another approach is developped in IIOW88I : the author derives de- 
scent directions and optimality conditions from the perturbation theory and uses so-called dual 
matrices. 

In the context of population dynamics, the problem of the maximization of the growth rate 
of a population can be modeled by the maximization of the Perron value of a given family of 
matrices. This technique is used in |Log08| to identify the parameters of a population dynamic 
model, in llBCF^l II for chemotherapy optimization purposes. An approach based on branching 
Markov decision processes is presented in IIRW82I . Perron value optimization also appears in 
other contexts like in the minimization of the interferences in a network | BS07 |. 

Apart from the stochastic case which can be solved by Markov decision techniques, like for 
PageRank, the search for a matrix with optimal eigenvectors does not seem to have been much 
considered in the literature. Indeed, the problem is not well defined since when an eigenvalue is 
not simple, the associated eigenspace may have dimension greater than 1 . 

Organization. The paper is organized as follows. In Section|2] we introduce Perron eigenvector 
and eigenvalue optimization problems and show that these problems are NP-hard problems on 
convex sets of matrices. We also point out some problems solvable in polynomial time. In 
Section [5] we give in Theorem [T] a method for the efficient computation of the derivative of 
objective function. Then in SectionH] we give the coupled power and gradient iterations and its 
convergence proof. In Sections|5]andj6] we show how HITS authority optimization problems and 
HOTS optimization problems reduce to Perron vector optimization. Finally, we report numerical 
results on a fragment of the real web graph in Section]?] 

2. Perron vector and Perron value optimization problems 

Let M e R"^" be a (elementwise) nonnegative matrix. We say that M is irreducible if it is 
not similar to a block upper triangular matrix with two blocks via a permutation. Equivalently, 
define the directed graph with n nodes and an edge between node / and j if and only if Mij > 0: 
M is irreducible if and only if this graph is strongly connected. 

We denote by p(M) the principal eigenvalue of the irreducible nonnegative matrix M, called 
the Perron root. By Perron-Frobenius theorem (see 0BP94I for instance), we know that p{M) > 
and that this eigenvalue is simple. Given a normalization A^, we denote by u(M) the correspond- 
ing normalized eigenvector, called the Perron vector. The normalization is necessary since the 
Perron vector is only defined up to positive multiplicative constant. The normalization function 
A' should be homogeneous and we require u{M) to verify N{u(M)) = 1 . The Perron-Frobenius 
theorem asserts that u{M) > elementwise. The Perron eigenvalue optimization problem on the 
set Ai can be written as: 

min f(p(M)) (1) 

MeAl 
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The Perron vector optimization problem can be written as: 

min f{u(M)) (2) 

MeM. 

We assume that / is a real valued continuously differentiable function; A1 is a set of irreducible 
nonnegative matrices such that M - h(C) with h continuously differentiable and C a closed 
convex set. These hypotheses allow us to use algorithms such as projected gradient for the 
searching of stationary points. 

We next observe that the minimization of the Perron root and the optimization of a scalar 
function of the Perron vector are NP-hard problems and that only exceptional special cases ap- 
pear to be polynomial time solvable by current methods. Consequently, we shall focus on the 
local resolution of these problems, with an emphasis on large sparse problems like the ones 
encountered for web applications. We consider the two following problems: 

PERRONROOT_MIN: given a rational linear function A : ^ M"' and a vector b in Q'", 
find a matrix M that minimizes p(M) on the polyhedral set {M e M"^" | A(M) < b , M > 0). 

PERRONVECTOR_OPT: given a rational Unear function A : E"^" ^ W", a vector b in Q"\ 
a rational function / : M" — > M and a rational normalization function A^, find a matrix M that 
minimizes f(u(M)) on the polyhedral set {M e M"^" | A(M) < b , M > 0], where u(M) verifies 
N(u(M)) = 1. 

In general, determining whether all matrices in an interval family are stable is a NP-hard 
problem IB TOOL The corresponding problem for nonnegative matrices is to determine whether 
the maximal Perron root is smaller than a given number Indeed, as the Perron root is a mono- 
tone function of the entries of the matrix (see Proposition]?] below), this problem is trivial on 
interval matrix families. However, we shall prove NP-hardness of PERRONROOT_MIN and 
PERRONVECTOR_OPT by reduction of linear multiplicative programming problem to each of 
these problems. The linear multiplicative problem is: 

LMP: given a nxm rational matrix A and a vector b in Q'", find a vector x e M" that minimizes 
X1X2 on the polyhedral set {x e M" | Ax < b , xj , X2 > 0). 

A theorem of Matsui |,Mat96,il states that LMP is NP-hard. We shall need a slightly stronger 
result about a weak version of LMP: 

Weak-LMP: given e > 0, a n x m rational matrix A and a vector b in Q'", find a vector x e Q 
such that X1X2 < y\y2 + e for all y e Q, where Q = {x eM." \ Ax < b , xi, ^2 > 0). 

Lemma 1. Weak-LMP is a NP-hard problem. 

Proof. A small modification of the proof of Matsui fMatOSl gives the result. If we replace 
g{xQ,yQ) < by g{xQ,yQ) < -2 in Corollary 2.3 we remark that the rest of the proof still holds 
since n^p'^'^ + - 4;?'*"+' < -2 for all n > 1 and p - n"\ Then, with the notations of ||Mat96l , 
we have proved that the optimal value of P\{M) is less than or equal to Ap^" - 2 if and only if 
Mx - 1 has a - 1 valued solution and it is greater than or equal to Ap^" if and only if Mx - 1 
does not have a - 1 valued solution. We just need to choose e < 2 in problem P\{M) to finish 
the proof of the lemma. □ 

Proposition 1. PERRONROOTJAIN and PERRONVECTOR.OPT are NP-hard problems. 

Proof. We define the matrix M by the matrix with lower diagonal equal to x and 1 on the top left 
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We set the admissible set X for the vector x as X = {x e M^lAx >b,x>Q] with a rational p xm 
matrix A and a rational vector b of length p. 

For the eigenvector problem, we set the normalization ui - 1 and we take f(u) - u„. We 
have u„ - p, so the complexity is the same for eigenvalue and eigenvector optimization is this 
context. 

Now minimizing p{M) is equivalent to minimizing xiX2 . . .x„ because the n-th root is an 
nondecreasing function on M+. We thus just need to reduce in polynomial time the e-solvability 
of weak-LMP to the minimization of xiX2 . . .x„ on X. 

As X i-> log(xiX2 . . . x„) is a concave function, either the problem is unbounded or there exists 
an optimal solution that is an extreme point of the polyhedral admissible set (Theorem 3.4.7 
in |BS06 1). Hence, using Lemma 6.2.5 in IIGLS88I we see that we can define a bounded problem 
equivalent to the unbounded problem and with coefficients that have a polynomial encoding 
length: in the following, we assume that the admissible set is bounded. 

Given a rational number e > 0, a rational p x m matrix A' and a rational vector b' of length 
p, let X' :- {x e M™|A'x > b',X[ > 0, X2 > 0) be a bounded polyhedron. Denoting V2 '■- 
minxex' x\X2, we are looking for x' € X' such that \y\yi - V2I < e. 

Compute C := min,E[„] min^^^x' Xj and C := max,j=[„] max^^gx' Xi (linear programs). We first set 
mo = C - C + 1 so that mo > and mo > -C and e M" defined by f"'° = if / € {1,2) and 
f"'" = mif / > 3. Let 

Xo := {x e M"'|A'x >b' - A'f'"°, x > 0) . 
We have V2 = mini<=x' x\X2 - minigXo X1X2. Let vo '.- minfgx xiX2 . . . x„. For all x e Xq, we have 
xiX2(mo + C)""^ < X1X2 . . . x„ < xiX2('wo + C)""^ , 



so that V2(mo + C)""^ < vo < v^iniQ + C)"" 



We now set 



m := maxl-C + 2' 



n-3 



c 



,C-2C,1) 



e (mo + C)«-2 

and we define f'" and X in the same way as and Xq. Remark that m an encoding length 
polynomial in the length of the entries. Let v„ \- minj^gx -^i-^a ■ --Xn- For all x e X, we have 
V2(m + C)""^ < v„ < v'2(m + C)""^ and x' = x - f'" is a point of X' with x^ x^ 

As V2(mo + C)""2 < Vo, m > -C + 2"-^e 

(m+C)" " 



"1-^2 

^V2. As m > C - 2C, ^ > , 

^ — ;n+C 2 



X1X2. 

, SO that — ^— > 



:57;yT('« + C) and , , „ , 



> eV2. 



Denote M := (m + C)"-^ and A := (m + C)"-^ _ (,„ + q«-2. < < (M + A)v'2 We have 
A := S'^Io ("^^)m*C"-2-* - S^IZq ("f )»J*C"-2-* < C{m + C)"-^ Hence, M > fv2. As A > 0, 
M + A > -V2. We obtain 



e > 



AMv2 



1 



M(M + A) M 



M + A 
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)MV2 > V2 



M + A 



Finally 

= <V2< r + e . 

(mo + C)"-2 (wo + C)"-^ 

which proves that weak-LMP reduces to the minimization of xi jca . . . jc„ on X. □ 

The general Perron eigenvalue optimization problem is NP-hard but we however point out 
some cases for which it is tractable. The following proposition is well known: 

Proposition 2 ( IKin61ll ). The eigenvalue p(M) is a log-convex function of the log of the entries 
of the nonnegative matrix M. 

This means that log op o exp is a convex function, where exp is the componentwise exponen- 
tial, namely if < cf < 1 and A and B are two nonnegative n x n matrices then for C,j - A" .b!t", 
piOKpiATpiB)'-'. 

Corollary 1. The optimization problem 

min p{M) 

Meexp(C) 

with C convex is equivalent to the convex problem 

min log op o exp(L) 

The difference between this proposition and the previous one is that here M - h(C) = exp(C) 
whereas previously we had h affine. This makes a big difference since an e-solution of a convex 
program can be found in polynomial time [BTNOl 1 . 

Remark 1. The largest singular value (which is a norm) is a convex function of the entries of the 
matrix. For a symmetric matrix, the singular values are the absolute values of the eigenvalues. 
Thus minimizing the largest eigenvalue on a convex set of nonnegative symmetric matrices is a 
convex problem. 

In order to solve the signal to interference ratio balancing problem, Boche and Schuber IIBS07I 
give an algorithm for the global minimization of the Perron root when the rows of the controlled 
matrix are independently controlled, ie when the admissible set is of the form x . . . x Z„ and 
Zk 6 Z.k is the k''' row of the matrix. 

Proposition 3 (|BS07|). Let Z, — Zi x . . . x and T be a fixed positive diagonal matrix. 
If the k''^ row of the matrix V(z) only depends on Zk 6 Zk, V{z) is irreducible for all z & Z 
and if V(z) is continuous on Z (Z can be discrete), then there exists a monotone algorithm 
that minimizes p(rV(z)) over Z, in the sense that p(rV(Zn+i)) ^ pi^ViZn)) for all n > and 
lim„p(ry(z„)) = min,e^p(ry(z)). 

Let X be the admissible set of a Perron eigenvalue optimization problem. Denote tt, the pro- 
jection on the coordinate. Then the minimization of the Perron value over X := ni(X) x . . . x 
7T„(X) is a relaxation of the original problem which is solvable with the algorithm of Proposition[3] 
as soon as all matrices in X are irreducible. We thus get a lower bound for the optimization of 
the Perron eigenvalue problem in a general setting. This is complementary with the local opti- 
mization approach developped in this article, which would yield an upper bound on the optimal 
value of this problem. 
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Remark 2. As developped in IIFABGlOl . general PageRank optimization problems can be for- 
mulated as follows. Let M be the transition matrix of PageRank and p the associated occupation 
measure. When M is irreducible, they are linked by the relation - = hij(p), which 

yields our function h. We also have u(h(p))j - Y^kPik and p,j = u(M)iMij. 

If the set C, which defines the design constraints of the webmaster, is a convex set of occu- 
pation measures, if li is as defined above and if / is a convex function, then 

min/(M(/j(p))) 

peC 

is a convex problem. Thus e-solutions of PageRank optimization problems can be found in 
polynomial-time. Details of this development and exact resolution for a linear / can be found 
in III^ABGIOI . 



3. A power-type algorithm for the evaluation of the derivative of a function of the principal 
eigenvector 

We now turn to the main topic of this paper We give in Theorem [T] a power-type algorithm 
for the evaluation of the partial derivatives of the principal eigenvector of a matrix with a simple 
principal eigenvalue. 

We consider a matrix M with a simple eigenvalue A and associated left and right eigenvectors 
u and V. We shall normalize vby the assumption 2ie[«] ^i^i - 1- The derivatives of the eigenvalue 
of a matrix are well known and easy to compute: 

Proposition 4 ( 0Kat66ll Section II.2.2). Denoting v and u the left and right eigenvectors of a 
matrix M associated to a simple eigenvalue A, normalized such that 2;£[n] ^(M; — 1. tlie derivative 
of A can be written as: 

dA 

dMij ' 

In this section, we give a scalable algorithm to compute the partial derivatives of the function 
/ o M, that to an irreducible nonnegative matrix M associates the utility of its Perron vector In 
other words we compute gij - Yjk ^ This algorithm is a sparse iterative scheme and it is 
the core of the optimization algorithms that we will then use for the large problems encountered 
in the optimization of web ranking. 

We first recall some results on the derivatives of eigenprojectors (see lKat66ll for more back- 
ground). Throughout the end of the paper, we shall consider column vectors and row vectors 
will be written as the transpose of a column vector. Let P be the eigenprojector of M for the 
eigenvalue A. One can easily check that as /I is a simple eigenvalue, the spectral projector is 
P - uv^ as soon as v^u = L We have the relation 

dP 

-SEijP-PEjjS 



where Ejj is the n X n matrix with all entries zero except the //'' and S - (M - APf is the Drazin 
pseudo-inverse of M - AI. This matrix S also satisfies the equalities 

S{M-AI)^{M-AriS^I-P and SP^PS^Q (3) 
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When it comes to eigenvectors, we have to set a normaUzation for each of them. Let be 
the normaUzation function for the right eigenvector. We assume that it is differentiable at u and 
that N{au) - aNiu) for all nonnegative scalars a, which implies ^(u) - u- N(u). We normalize 
V by the natural normalization 2/ m, v; - 1. 

Proposition 5 (|MS88|). Let M be a matrix with a simple eigenvalue A and associated eigen- 
vector u normalized by N(u) — I. We denote S — (M - AI)*. Then the partial derivatives of the 
eigenvector are given by: 

du T 

(M) = -SciUj + {VN{uY SeiUj)u 



where Cj is the vector with i''' entry equal to 1. 

To simplify notations, we denote V/ for Vf{u) and VA^ for VN(u). 

Corollary 2. Let M be a matrix with a simple eigenvalue A and associated eigenvector u nor- 
malized by N(u) — 1. The partial derivatives of the function M i-» (f ou)(M) at M are gij — WiUj, 
where the auxiliary vector w is given by: 

= (-V/^ + (V/ ■ uWN^)S = (-V/^ + (V/ ■ u)p^)(M - Alf 

Proof. By Proposition [5] we deduce that 



8ij 



^ duk dMij ^ duk ^ duk ^ dui 



which is the developed form the result. □ 

This simple corollary already improves the computation speed. Using Proposition[5]directly, 
one needs to compute in every direction, which means the computation of a Drazin inverse 

and matrix-vector products involving M for the cmputation of ( ^jfj- 1 . With Corollary 2 

we only need one matrix-vector product for the same result. The last difficulty is the computation 
of the Drazin inverse S . In fact, we do not need to compute the whole matrix but only to compute 
X for a given x. The next two propositions show how one can do it. 

Proposition 6. Let M be a matrix with a simple eigenvalue A and associated eigenvector u 
normalized by N{u) — 1. The auxiliary vector w of Corollary^is solution of the following 
invertible system 

M - AI -u 
VN^ 



[h'^,w„+i] 
where w„+i e R. 



= [-V/^0] . (4) 



Proof. A nullspace argument similar to this of IINel761 shows that 



is invertible as 



M-AI -u 
VN^ 

soon as A is simple and VN^u - 1. Then the solution w of the system Q verifies the equations 
w^(M - AI) -H w„+iVN^ - -V/^ and w^u - 0. Multiplying the first equahty by u yields 
w„+iVN'^u - -Vf'^u and multiplying it by S yields w^(/-mv^) = -w„+iVA^^5 -Vf'''S. Putting 
all together, we get = (- V/^ -i- {Vf'^u)VN'^)S . □ 
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The next proposition provides an iterative scheme to compute the evaluation of the auxiliary 
vector w when we consider the principal eigenvalue. 

Definition 1. We say that a sequence {xi;)it>Q converges to a point x with a linear convergence 
rate a if limsupj,^„||jiCi; - < a. 

Proposition 7. Let M be a matrix with only one eigenvalue of maximal modulus p - > \A2\. 
With the same notations as in CorollaryyA we denote M — ^Mandz^ — ^{—Vf^ + (yf-u)VN^), 
and we fix a real row vector wq. Then tnefix point scheme defined by 

Vk e N, w[_^i = (-z^ + wJM)(/ - P) 

with P — uv^, converges to w^ — {-V +{V f -uy^N^^M-pI)* with a linear rate of convergence 

\M 
p 

Proof We have - -z^(M(I-P))' +wJ^(M(I-P)f. By assumption, all the eigenvalues of 
M different from 1 have a modulus smaller than 1 . Thus, using the fact that P is the eigenprojector 
associated to 1, we get p(M(I - P)) = !^ < 1. By |Ost55|, (\\M(I - ^ p(M(I - P)), 

so the algorithm converges to a limit w and for all e > 0, \\wk - w\\ - ©((l^^y^)*^). This implies 
a hnear convergence rate equal to 1^ . The limit w satisfies w^ - (-z^ + w^M)(I-P), so w^P - 
and as MP - P, w^M -w^- z^(I - P). We thus get the equality w^(M - I) - zJ- ■ Multiplying 
both sides by (M - /)*, we get: 

w^(M - /)(M - /)* = - w'^P = = z^{M - if 

The last equalities and the relation (J3^^M)* - fiM* show by Proposition [2] that g - wu^ is the 
matrix of partial derivatives of the Perron vector multiplied by V/. □ 

This iterative scheme uses only matrix-vector products and thus may be very efficient for 
a sparse matrix. In fact, it has the same linear convergence rate as the power method for the 
computation of the Perron eigenvalue and eigenvector. This means that the computation of the 
derivative of the eigenvector has a computational cost of the same order as the computation of 
the eigenvector itself. We next show that the eigenvector and its derivative can be computed in a 
single algorithm. 

Theorem 1. IfM is a matrix with only one simple eigenvalue of maximal modulus p — > \A2\, 
then the derivative g of the function f o u at M, such that gij — Yjk ^ limit of the 

sequence (wiuJ)i>o given by the following iterative scheme: 

Mui 
N(Mui) 

"'^^ vjMuM 

= -C^//^ - C^/' ■ «')V< + w]M){I - UMvl^) 
Pi 

where pi — N(Mui), Vf — Vf(ui) and VNi — VN{ui). Moreover, the sequences (uf), (v/) and (wi) 
converge linearly with rate —. 
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Of course, the first and second sequences are the power method to the right and to the left. 
The third sequence is a modification the scheme of Proposition |7] with currently known values 
only. We shall denote one iteration of the scheme of the theorem as 

(Mi+i,Vi+i,w*+i) = POWERDERlVATlVEM(M*,vt,vvi) . 

Proof. The equalities giving m/+i and v',+i are simply the usual power method, so by IIPP73I . they 
convergence linearly with rate to m and v, the right and left principal eigenvectors of M, such 
that P - uv^ is the eigenprojector associated to p(M). Let 

zf := i(-V/,^ + (V/, ■ M,)VA^,^) : 
Pi 

limz; = z by continuity of V/, and VA^ at u. We also have 

w] = (-zf + -vv[M)(/- M,+iv[^i) . 
Pi 

We first show that m>/ is bounded. As in the proof of Proposition |7j p(M{I - P)) = < 1. 
Thus, by Lemma 5.6.10 in IIHJ90L there exists a norm and a < 1 such that M{I - P) is 
ff-contractant. Let S be the unit sphere: Vx e S, \\M{I - P)x\\m < a\\x\\M- By continuity of the 
norm, Ve > 0,3L,V/ > L,Vx e S ,\\^M(I - umvI^)x\\m < (a + e)||x||M. As ^M(I - u^vl^) is 
linear, we have the result on the whole M" space. Thus vv/ is bounded. 

Let us denote M - -M and 

p 

Pi PPi 
so that vvj^j = -z] + wJM(I - uv^). We have: 

/-I 

i-i /-I 

= ivimi - p))' - ^ z^(M(/ - p)f - Y,(~4-i-k - z^)mi - p)f 

k=i) k=0 

By Proposition |7] the sum of the first and second summand correspomd to w/ and converge 
linearly to when I tends to infinity with convergence rate . Corollary 2 asserts that g - 

Um w/m[. For the last one, we remark that for all e > 0, \\(M(I - f ))*|| = 0(2^)*. In order to 
get the convergence rate of the sequence, we need to estimate ||z, - z|| - \\zi - Zi + Zi - z||. 

Hz, -zll < \\(zJumK,\\ + ||-vvfM(Mv^ - UMvl,)\\ + ll^^vvf II + ||z, - z|| 

Pi ppl 

The second and third summands are clearly 0(('^^y^)') . For the first summand, as WNjuj - 1, 
we have 

|z[m/+iI = \-i-^fluMiWflu,)(^NjuM))\ 
Pi 

< -(sup||V/,^|| + sup||VA^,^||)||M,+i - M,|l . 

Pi l>0 l>0 
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As (m/)/>o is bounded and / and are C\ the constant is finite and |zfM;+i|||v/+i|| = ©((l^^y^)')- 
With similar arguments, we also show that ||z/ - z|| - 0(('^^^)') 

Finally, we remai'k that for all k, (zj_i_^ - z^)iM{I - P))'' = 0((!^)'-i). Thus 

2(^1-. - z')mi - p)f - 0(/(^)'-') = 0((^M±i:y) , 

k=a P P 

for all e' > e. The result follows. □ 

Remark 3. All this applies easily to a nonnegative irreducible matrix M. Let p be its princi- 
pal eigenvalue: it is simple thanks to irreducibility. The spectral gap assumption p > I/I2I is 
guaranteed by an additionnal aperiodicity assumption. Let v and u be the left and right eigen- 
vectors of M for the eigenvalue p. We normalize u by N{u) - 1 where verifies ^(m) > and 
N{Au) = AN(u) (which implies '■§^(u) ■ u = N(u)) and v by Yji UiV, = L As m > 0, any normaliza- 
tion such that p - ^(m) > is satisfactory: for instance, we could choose N(u) - \\u\\i - 2; 
N(u) - \\u\\2 ov N(u) - Mj. 

Remark 4. Theorem[T]gives the possibility of performing a gradient algorithm for Perron vector 
optimization. Fix e and apply recursively the power-type iterations POWERDERIVATIVEm 
until ||m; - M/+1II + \\wi - vv7+i|| < e. Then we use w/m; as the descent direction of the algorithm. 
The gradient algorithm will stop at a nearly stationary point, the smaller e the better. In order to 
accelerate the algorithm, we can initialize the recurrence with former values of m;, v/ and vv;. 



4. Coupling gradient and power iterations 

We have given in Theorem [T] an algorithm that gives the derivative of the objective function 
at the same computational cost as the computation of the value of the function. As the problem 
is a differentiable optimization problem, we can perform any classical optimization algorithm: 
see IIBGLS06I IBer95l INW99I for references. 

When we consider relaxations of HITS autority or HOTS optimization problems, that we 
define in Sections |5] and |6] the constraints on the adjacency matrices are very easy to deal with, 
so a projected gradient algorithm as described in IIBer76 l will be efficient. If the problem has not 
a too big size, it is also possible to set a second order algorithm. However, matrices arising from 
web applications are large: as the Hessian matrix is a full x matrix, it is then difficult to 
work with. 

In usual algorithms, the value of the objective function must be evaluated at any step of the 
algorithm. As stressed in ||Ove91| , there are various possibilities for the computation of the 
eigenvalue and eigenvectors. Here, we consider sparse nonnegative matrices with a simple prin- 
cipal eigenvalue: the power method applies and, unlike direct methods or inverse iterations, it 
only needs matrix-vector products, which is valuable with a large sparse matrix. Nevertheless 
for large matrices, repeated principal eigenvector and eigenvalue determinations can be costly. 
Hence, we give a first order descent algorithm designed to find stationary points of Perron eigen- 
vector optimization problems, that uses approximations of the value of the objective and of its 
gradient instead of the precise values. Then we can interrupt the computation of the eigenvector 
and eigenvalue when necessary and avoid useless computations. Moreover, as the objective is 
evaluated as the limit of a sequence, its exact value is not available in the present context. 
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The algorithm consists of a coupling of the power iterations and of the gradient algorithm 
with Armijo line search along the projected arc [Ber76|. We recall this gradient algorithm in 
Algorithm[T] We shall, instead of comparing the exact values of the function, compare upper and 
lower bounds computed during the course of the power iterations. 

Algorithm 1 Gradient algorithm with Armijo line search along the projected arc |Ber76 | 
Let a differentiable function J, a convex admissible set C and an initial point xq e C and parame- 
ters cr e (0, 1), a" > and fi € (0, 1). The algorithm is an iterative algorithm defined for all e N 
by 

and - yS^'c" where mj. is the first nonnegative integer m such that 

J[?c{xk -y6"'a°V7(xi))) - J{xt) < -cr — ^ 



If we had an easy access to the exact value of u(M) for all M e h(C), we could use the 
gradient algorithm with Armijo line search along the projected arc with J-fouohto find 
a stationary point of Problem (j2|. But when computing the Perron eigenvector by an iterative 
scheme like in Theorem[r| we only have converging approximations of the value of the objective 
and of its gradient. The theory of consistent approximation, developped in [Pol971 proposes 
algorithms and convergence results for such problems. If the main applications of consistent 
approximations are optimal control and optimal control of partial derivative equations, it is also 
useful for problems in finite dimension where the objective is difficult to compute IP P02J . 

A consistent approximation of a given optimization problem is a sequence of computationally 
tractable problems that converge to the initial problem in the sense that the stationary points of the 
approximate problems converge to stationary points of the original problem. The theory provides 
master algorithms that construct a consistent approximation, initialize a nonlinear programming 
algorithm on this approximation and terminate its operation when a precision (or discretization) 
improvement test is satisfied. 

We consider the Perron vector optimization problem defined in Q 

min J(x) - min f o u o h(x) . 

.veC -veC 

For X e C, n e N and for arbitrary fixed vectors mq, vq and vvo, we shall approximate with order 
A(n) the Perron vectors of h{x), namely u and v and the auxiliary vector w of Corollary [2]by 

Vi„, vv,„) (POWERDERIVATIVEh(x))'"(Mo, v,,, vvq) , (5) 

where k,, is the first nonnegative integer k such that 

\\{uk+uVk+i,Wk+i) - {uk,Vk,Wk)\\ < A(n) (6) 

The map POWERDERTVATIVE is defined in Theorem [T] Then the degree n approximation 
of the objective function J and of its gradient Vy are given by 

Jnix) = /(Mi„) , gnix) = ^ Wk,Xi)'^hij(x)Uk^(j) (7) 
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An alternative approach, proposed in rPP021, is to approximate (m, v, w) by the n''^ iterate 
(u„, v„,w„) ;= (POWERDERIVATIVEh(x))"(Mo> ^o^ wq). We did not choose this approach since it 
does not take into account efficiently hot started power iterations. 

We define an approximate gradient step with Armijo line search in Algorithm |2] 

Algorithm 2 Approximate Armijo line search along the projected arc 

Let (M„)„>o be a sequence diverging to +00, cr e (0, 1), a" > 0, /? € (0, 1) and 7 > 0. Given 
n e N, J„ and g„ are defined in (|7]i. For x e C, the algorithm returns A„(x) defined as follows. If 
for all nonnegative integer m smaller than M„, 

/ n N \\x-Pc(x-/3'"a°g„(x))\\^ 

j„ {Pc(x-ro^°g„(x))) - ux) > -a'^ — 

then we say that the line search has failed and we set An(x) - 0. Otherwise, let m„ be the first 
nonnegative integer m such that 



7„ [Vc{x-I3'"a\„{x))) - J„(x) < -a 



|x-Pc(x-yS'«aV„(x))||^ 



and define the next iterate A„{x) to be A„{x) - Pc(x - /3'""a^^g„{x)). 



Then we shall use the Approximate Armijo line search along the projected arc A„ in the 
following Master Algorithm Model (Algorithm [3]l. 

Algorithm 3 Master Algorithm Model 3.3.17 in P Pol97l 

Let cj e (0, 1), cr' € (0, 1), n_i e N and xq e C, N ^ {n \ A„(x) + % , SxeC] and (A(n))„>o be 
a sequence converging to 0. 

For / e N, compute iteratively the smallest € TV and such that n,- > n,_i, 

x,+i e A„.{xi) and 
Jn:{Xi+\) - J„Xxi) < -o-'MriiT 



In order to prove the convergence of the Master Algorithm Model (algorithm [3]) when used 
with the Approximate Armijo line search (Algorithm [2]), we need the following lemma. 

Lemma 2. For all x* & C which is not stationary, there exists p* > Q, 6* > Q and n* e N such 
that for all n > n*, and for all x e B(x*,p*) n C, A„(x) + and 

^«((Pc(x - Ungnix))) - J„{x) < -6* 

where a„ is the step length returned by the Approximate Armijo line search A„(x) (Algorithm^. 

Proof. Let x e C. Suppose that there exists an infinitely growing sequence ((/>n)n>o such that 
A^^Xx) - for all n. Then for all m < M^,,, 

{Pcix -j6'"a%„(x))) - J^„(x) > -cTig^Mlx- Pc(x -^aVX^))) 
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When n +00, M^^^ +00, J^Jx) J(x) and g^„(x) — » V/(x) (Theorem [T|, so we get that 
for all men, j[Pc(xk -yS'"a°V7(x^))) - Jix^) > _o- ll-'-'-Pc(-'-'-^^"^;^°^-^(-''))ll2 ^ ^hich is impossible 
by IIBer76l . 

So suppose n € N is sufficiently large so that A_ (x) 0. Let a,, the step length determined by 
Algorithm |2] and let a be the step length determined by Algorithm [T] at x. We have: 

||x - Pc(x - angn(x))\\l 
Jn (PcU - dngnix))) - J„{x) < -CT 

a„ 

and if a„ + ck°, 

Wt. . . ^^^ ||x - Pc(x - a„^„(x))||2 
■/« (Pc(-« - j6 Cngn W) j - Jn\X) > -CT — 

(o:n)n>Q is a bounded sequence so it has a subsequence (a^J„>o converging to, say, a. As (a„)„>o 
can only take discrete values, this means that ff^^ - a for all n sufficiently big. 
When n tend to infinity, by Theorem[T] we get 



J (Pc(x - aVf(x))) - J(x) < -o- 



\x-Pc(x-aVf(x))\\l 



a 

and if a 5^ a", 

\\x-Pc(x-/3-'aVf(xml 



j{Pc(x-r'aVf(x))) - Jix) > -CT- 



Then, if a is the step length returned by Armijo rule (Algorithm [T]), then a> a, because a is the 
first number of the sequence that verifies the first inequality. Similarly, consider the version of 
Armijo rule with a strict inequality instead of the non strict inequality. Then if Cstrict is the step 
length returned by this algorithm, we have astrict ^ 

Moreover, like in IIPP02I one can easily see that Vx* e C not stationary, 3p* > 0, 36* > 0, 
such that Vx € B(x*,p*) n C, 



y((Pc(x- a,trictV/(x))) - J(x) < -6' 



where 5* = cr "''' '^"2 _ ||V7(x*)|^?*. By Lemma 3 in IIGB82I , a,t,iet < a implies that 



||x'-Pc(.i*-asiriciV/'U'))||2 \\x-VAx-a\ f(x))\\i .... „ „ ■ r , w » ^ 

N cy jK ;jii2 > N — ev_^^_n«ii2_ ^jj adherent pomt of (q'„)„>o, Vx* e C 

not stationary, 3p* > 0, 36* > and 3n* e N, such that V/i > n*, Vx e B(x*,p*) n C, 

w/r, / /XXX lk-Pc(-«-Q'strictV/(x))||2 

y„((Pc(x - Q'„g„(x))) - J„ix) < -0- ^ +6*/4< -6*12 , 

Q^stiict 

for n* sufficiently large and p* sufficiently small. □ 

In BPP02I . the property of the lemma was proved for exact minimization in the line search. 
We proved it for Algorithm |2] 

We shall also need the following result 
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Proposition 8 (Theorem 25 in |May94) ). Let M e M„,„(K), 1 e 
p e M" iMc/i f/zaf = 1. Denote 



B : 







C 


Ax - Ax 















In 



1-0-+ Va 

1, \A* -A\<I3 



o" = 114+1 - CB\\ and t - ||C|U- If cr < \ and A = (1 - o")^ - 477T > 0, then f5 = 
nonnegative and there exists a unique eigenpair {x* ,A*) of M such that p^x* 
and \ \x* - x||oo < y6. 

Theorem 2. Let (x,),>() be a sequence constructed by the Master Algorithm Model (Algorithm^ 
for the resolution of the Perron vector optimization problem (2]l such that A„(x) is the Approxi- 
mate Armijo line search along the projected arc (Algorithm^) and A(n) = (Aq)" /or Aq e (0, 1). 
Then every accumulation point of(xi)i>o is a stationary point of 

Proof. The proof of the theorem is based on Theorem 3.3.19 in IPol97L This theorem shows 
that if continuity assumptions hold (they trivially hold in our case), if for all bounded subset S of 
C there exist K > such that for all jc e C 

\Jix) - J„(x)\ < KA{n) , 

and if for all x* e C which is not stationary, there exists p* > 0, 6* > and n* eN such that for 
all n > n*, for all x e B(x*,p*) n C and for all y e A„(x), 

Jn(y) - J nix) < -6* , 

then every accumulation point of a sequence (ji:,),>o generated by the Master Algorithm Model 
(algorithm [3]l is a stationary point of the problem of minimizing J{x). 

We first remark that for x e C, u - u(h(x)) and J„ defined in (|7]i, as / is continuously 
difFerentiable, we have \ J(x) - J„{x)\ < ||V/(m)|| \\u - Ukjl- 

We shall now show that for all matrix M, there exists K > such that Hm-m,,!! < /TUMn+i -u„\\. 
Remark that Theorem 16 in | May94 1 gives the result with K > ^qj^ when M is symmetric. When 
M is not necessarily symmetric, we use Propositionjsjwith x = u, A = p and C is the inverse of B. 
Let e > 0, by continuity of the inverse, for n sufficiently large, if we define B„ to be the matrix of 
Proposition^ with x - u„ and A - N(Mu„), we still have cr :- \\I„+i - CB„\\ < e. We also have: 



C 



Mx - Ax 




< \\C\U\Mu„ - N(Mu„)u„\\ 



< \\C\\^N(Mu 

The conclusion of Propositionjsjtells us that if A := (1 - cr)^ - A-tjt > 0, then 

277 2ri 



</3: 



1 



< 377 < 3\\C\\ocN(Mu 



1 -0-+ Va 

Now, as the inversion is a continuous operation, for all compact subset S of C, there exists K > 
such that for all M - h(x) e h(C) and for all n sufficiently big, \\u - u„\\ < K\\u„+i - u„\\. 
By definition of k„ (|6|, we get 

\J(x) - J„(x)\ < ||V/(m)||||m - M,J| < K\\uk„^i - m,J| < KA(n) . 

By Lemma|2] the other hypothesis of Theorem 3.3.19 in llPol97ll is verified and the conclusion 
holds: every accumulation point of ixi)i>o is a stationary point of (|2]l. □ 
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5. Application to HITS optimization 



In the last two sections, we have developped scalable algorithms for the computation of the 
derivative of a scalar function of the Perron vector of a matrix and for the searching of stationary 
points of Perron vector optimization problems Q. We now apply these results to two web 
ranking optimization problems, namely HITS authority optimization and HOTS optimization. 

HITS algorithm for ranking web pages has been described by Kleinberg in |Kle99|. The 
algorithm has two phases: first, given a query, it produces a subgraph G of the whole web graph 
such that in contains relevant pages, pages linked to relevant pages and the hyperlinks between 
them. The second phase consists in computing a principal eigenvector called authority vector 
and to sort the pages with respect to their corresponding entry in the eigenvector. If we denote 
by A the adjacency matrix of the directed graph G, then the authority vector is the principal 
eigenvector of A^A. 

It may however happen that A^A is reducible and then the authority vector is not uniquely 
defined. Following fLM061, we remedy this by defining the HITS authority score to be the 
principal eigenvector of A^A + ^ee^, for a given small positive real ^. We then normalize the 
HITS vector with the 2-norm as proposed be Kleinberg IIKle99L 

Given a subgraph associated to a query, we study in this section the optimization of the au- 
thority of a set of pages. We partition the set of potential links (/, j) into three subsets, consisting 
respectively of the set of obligatory links O, the set of prohibited links I and the set of facul- 
tative links ;F. Some authors consider that links between pages of a website, called intra-links, 
should not be considered in the computation of HITS. This results in considering these links as 
prohibited because this is as if they did not exist. 

Then, we must select the subset J of the set of facultative links f which are effectively 
included in this page. Once this choice is made for every page, we get a new webgraph, and define 
the adjacency matrix A - A{J). We make the simplificating assumption that the construction of 
the focused graph G is independent of the set of facultative links chosen. 

Given a utility function /, the HITS authority optimization problem is: 

max {/(m) ; iA{jfA{J) + fee'^)u = Au , \\u\\2 = 1 , m > 0) (8) 

Jcr,ueR",AeR 

The set of admissible adjacency matrices is a combinatorial set with a number of matrices 
exponential in the number of facultative links. Thus we shall consider instead a relaxed version 
of the HITS authority optimization problem which consists in accepting weighted adjacency 
matrices. It can be written as 

max f(u) 

AeR">'",HeR»,peR 
(A^A + ^ee^)u - pu , \\u\\2 - 1 , u>0 

A,j = 1 , V(/,j)eO (9) 
A,-,, = 0, V(/,;)e J 
0<A,j< 1 , V(iJ)er 
The relaxed HITS authority optimization problem (|9]l is a Perron vector optimization prob- 
lem ^ with h{A) = A^A + ^ee^ and the normalization N(u) - ^Yji - 1. Hence VN(u{M)) - 

u(M). Remark that ||m|| = ||v|| - 1. Now ^{A).H - H^A + A^H so the derivative of the crite- 
rion with respect to the weighted adjacency matrix is (Aw)u^ + {Au)w^ with w = (V/^ - (V/ ■ 
uWN^XA^A + ^ee^ - pif . 
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Thanks to ^ > 0, the matrix is irredutible and aperiodic. Thus, it has only one eigenvalue of 
maximal modulus and we can apply Theorem |2] 

The next proposition shows that, as is the case for PageRank optimization BdKN vD08 1 IFAB GTOll . 
optimal strategies have a rather simple structure. 

Proposition 9 (Threshold property). Let A be a locally maximal linking strategy of the relaxed 
HITS authority optimization problem (|9| with associated authority vector u and derivative at 
optimum (Aw)m^ + (Am)w^. For all controlled page i denote = Iau) ' ^'^^ least one 
outlink. Then all facultative hyperlinks (/, j) such that ^ > bi get a weight of 1 and all facultative 
hyperlinks (i, j) such that ^ < bi get a weight ofO. 

In particular, if two pages with different bj's have the same sets of facultative outlinks, then 
their set of activated outlinks are included one in the other. 

Proof. As the problem only has box constraints, a nonzero value of the derivative at the maxi- 
mum determines whether the upper bound is saturated {gij < 0) or the lower bound is saturated 
{gij > 0). If the derivative is zero, the weight of the link can take any value. 

We have gjj = (Aw),Mj + {Au)iWj with uj > and (Am),- > 0. If Page / has at least one outlink, 
then (Am), > and we simply divide by (Am),- to get the result thanks to the first part of the proof. 
If two pages ii and ;2 have the same sets of facultative outlinks and if fe,, < fe,-,, then > bi^ 

implies ^ > all the pages pointed by /2 are also pointed by /]. □ 

Remark 5. If a page / has no outlink, then (Aw), = (Am),- = and gi j - for all j e [n], so we 
cannot conclude with the argument of the proof. 

Remark 6. This proposition shows that ^ gives a total order of preference in pointing to a page 
or another. 

Then we give on Figures [T] and |2] a simple HITS authority optimization problem and two 
local solutions. They show the following properties for this problem. 

Counter example 1. The relaxed HITS authority optimization problem is in general not quasi- 
convex nor quasi-concave. 

Proof. Any strict local maximum of a quasi-convex problem is necessarily an extreme of the 
admissible polyhedral set (this is a simple extension of Theorem 3.5.3 in liBS06J ). The example 
on Figure [T]shows that this is not the case here. 

A quasi-concave problem can have only one strict local maximum (although it may have 
many local maxima). The examples on Figures[T]and|2]show two distinct strict local maxima for 
a HITS authority optimization problem. □ 

Heuristic 1. These examples also show that the relaxed HITS authority optimization problem (|8]l 
does not give binary solutions that would be then solutions of the initial HITS authority optimiza- 
tion problem Hence we propose the following heuristic to get "good" binary solutions. From 
a stationary point of the relaxed problem, define the function (p : [0, 1] — > M such that 0(x) is 
the value of the objective function when we select in (|8]l all the links with weight bigger than x 
in the stationary point of (|9|. We only need to compute it at a finite number of points since (p is 
piecewise constant. We then select the best threshold. For instance, with the stationary point of 
Figure [T] this heuristic suggests not to select the weighted link. 
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Figure 1 : Strict local maximum for relaxed HITS authority optimization on a small web graph of 2 1 pages with 3 con- 
trolled pages (colored) representing the website /. Obligatory links are the thin arcs, facultative Hnks are all the other 
outlinks from the controlled pages except self links. The locally optimal solution for the maximization of f(u) = 
is to select the bold arcs with weight 1 and the dotted arc with weight 0.18. Selected internal links are dark blue, selected 
external links are light red. We checked numerically the second order optimality conditions liBer95i . 




Figure 2: Another strict local maximum for the same HITS authority optimization problem as in Figure[T] 
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6. Optimization of HOTS 



6.1. Tomlin 's HOTS algorithm 

HOTS algorithm was introduced by Tomlin in IITom03l . In this case, the ranking is the vector 
of dual variables of an optimal flow problem. The flow represents an optimal distribution of web 
surfers on the web graph in the sense of entropy minimization. It only takes into account the web 
graph, so like PageRank, it is query independent link-based search algorithm. The dual variable, 
one by page, is interpreted as the "temperature" of the page, the hotter a page the better. Tomlin 
showed that this vector is solution of a nonlinear fix point equation: it may be seen as a nonlinear 
eigenvector. Indeed, most of the arguments available in the case of Perron vector optimization 
can be adapted to HOTS optimization: we show that the matrix of partial derivatives of the 
objective has a low rank and that it can be evaluated by a power-type algorithm. Also, as for 
PageRank and HITS authority optimization, we show that a threshold property holds. 

Denote G = (V,E) the web graph with adjacency matrix A - (Aij) and consider a modified 
graph G' = (V, E') where V" = V U {« -i- 1) and £" = £ U (U,gv{/, « + 1)) U (iJjevin + 1, ;)). Fix 
a e (1 /2, 1). Then the maximum entropy flow problem considered in LTomOSJ is given by 

max - y p^log(pJ 

^ eeE' 
jeV jeV 
ijev 

^P«+i,;=l-a' (a„+i) 

jeV 

I - a - y^p;,,i+i {b„+i) 
ieV 

The dual of this optimization problem is 

min ()(p,fi, a„^ubn+i) := V Ai|eP'-P'^^' + V 

(10) 



2' 



g«„„-p,.;. - (1 - a)fl„+i - A' + (1 - a)K^x 

7e[n] 

This problem is a form of matrix balancing (see IISch90l ). For p being an optimum of Prob- 
lem ( [TO| i, the HOTS value of page / is defined to be exp(/7,). It is interpreted as the temperature 
of the page and we sort pages from hotter to cooler. 

The dual problem ( [TO) l consists in the non constrained minimization of a convex function, 
thus a necessary and sufficient optimality condition is the cancelation of the gradient. From these 
equations, we can recover a dual form of the fix point equation described in | Tom03 ] . Note that 
we take the convention of llSch90l so that our dual variables are the opposite of Tomlin's. 

From the expressions of a„+\, bn+\ and at the optimum, respectively given by e""*' = 
„ ^'"-n e~^, e"*"+' - V '~"„. and - ^ — ^°r^ „-„. , we can write as a function of p only: 

0{p)^ min e(p,id,a„+ub„+i) 

(11) 



C(a) + cf>(-p) + <p(p) + (2a - l)log( ^ A,-,e'''-''0 

',>£[«] 
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where CCff) = l-2(l-Q')log(l-Q')-(2Q'-l)log(2Q'-l) and where 0(/7) = (l-a')log(2,e[„] e^')- 
Its gradient is given by 



-(p) = -(1 - a)—— + (1 - a)—- - (2a - I) ^' + {2a - 1), ^ 



This equaHty can be also written as 

2p,l.. I^jAijC-P' l-a\ Y^jAiie"- , I -a , ^ ^ 

e " (2a - 1)— h — = (2(3' - 1)— + — + e" - — (») 

which yields for all p in M", 

1 v-i v-i Y^iiAijeP'^P' 30 

PI = -[log((2 A^e^'X^i ^'") + 2a _ I ""^ ^"'g-(P))) 
I j Pi 

- log(^ e-P^ - \og{{Y,A,je-P'){Y^ eP-) + ^^'V^"""^) + ^og^Z 

J j ' U ' 

Let u be the function from M" to R" defined by 

- log((2 Aije-P')i^ eP-) + Aij'^'-'") + log(2 ^"'Ol ■ 

j 'J ! 

Using the formula log(A) - \og{A + B) - log(l + B/A), we may also write it as 

1 30 
uiip) ^ pi- ^ log(l + di—ip)) . 

2 dpi 

where 

^^-'(i:,^-^o(i:,vA,,./'.-^o ^ 

' (2a-l)(2:,.A,7e/'0(Z;e-'^0 + (l-a)i:,;yA,ve''-'^.' 

We can see that the equation u{p) - p is equivalent to §^(p) - but also that successively 
applying the function u corresponds to a descent algorithm. This is the dual form of Tomlin's 
algorithm for the computation of HOTS values. Note that we do not compute the values of fl„+i, 
b„+i and p but give an explicit formula of the fix point operator. 

The next proposition gives information on the spectrum of the hessian of the function 6. 

Proposition 10. The hessian of the function ( |1 l| l is symmetric semi-definite with spectral norm 
smaller that 4. Its nullspace has dimension 1 exactly for all p and a basis of this nullspace is 
given by the vector e, with all entries equal to 1. 

Proof. As is convex, its hessian matrix is clearly symmetric semi-definite. 
Now, let : X i-> log(2, e'O be the log-sum-exp function. 

We have y V (p{x)y - -^fpi — g p'')^ ' ^^^^ expression is strictly positive for any non 
constant y, because a constant y is the only equality case of the Cauchy Schwartz inequality 
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X/yi-e-^'^^e-*'^^ < (Hi e-^')'^^(Ii, y- e'^O'^^- As the function is the sum of (p (the third term of ( fTT) ) 
and of convex functions, it inherits the strict convexity property on spaces without constants. This 
development even shows that the kernel of V^0(p) is of dimension at most 1 for all p. Finally, as 
9 is invariant by addition of a constant, the vector e is clearly part of the nullspace of its hessian. 

For the norm of the hessian matrix, we introduce the linear function Z : M" — » M"^" such 
that (Zp)ij = Pi - pj and ^ : z log(Ete[«]x[«] ^te")- Then e(p) = C(a) + (1 - a)(f>(p) + (1 - 
a)cl>(-p) + (2a - mZp). 

< y^V^4>(x)y = - < 1^ < Mi < \\y\\i Thus HV^Ib < 1. By similar 

calculations, one gets y^Z^^^{x)Zy < \\Zy\\l,. As HZylU = max,_j|y,- - yj\ < 2||y|U < 2||y||2, we 
have that ||V^0||2 <(1-q') + (1-q') + {2a - I) x 4 = 6a - 2 < 4-. Finally, for symmetric matrices, 
the spectral norm and the operator 2-norm are equal. □ 

6.2. Optimization of a scalar function of the HOTS vector 

As for HITS authority in Section|5] we now consider sets of obligatory links, prohibited links 
and facultative links. From now on, the adjacency matrix A may change, so we define 9 and u as 
functions of p and A. For all A, the HOTS vector is uniquely defined up to an additive constant 
for a < 1, so we shall set a normalization, like for instance Y^iPi = or log(2, exp(/?,)) - 0. 
Thus, given a normalization function A^, we can define the function /? : A i-» p{A). For all A, /, j, 
the normalization function may verify |^(;?(A))^(A) = and N{p + X) - N{p) + A for all 

AeR, so that |^(/:')e = 1. 

The HOTS authority optimization problem is: 

max ifip) ; u(A(J), p) ^ p , N(p) = , ) (13) 

Jcr,peR",AeR 

We shall mainly study instead the relaxed HOTS authority optimization problem which can be 
written as: 

max /(m) 

u(A,p) = p , N(p) = 

A,.,; = 1 , V(/,7)€0 (14) 

A,,y = 0, ^(i,j)el 
0<A,,y< 1 , V(i,j)er 

where / : K" — » M is the objective function. We will assume that / is difFerentiable with 
gradient V/. 

It is easy to see that m(A, ■) is additively homogeneous of degree 1, so the solution p of the 
equation u{A,p) - p may be seen as a nonlinear additive eigenvector of m(A, ■)■ In this section, 
we give the derivative of the HOTS vector with respect to the adjacency matrix. 

Proposition 11. The derivative of f o p is given by ,■ = y / w/c? . where 

w = (-V/^ + (V/^e)VA^^)(0)* 
and c'. J — , . Moreover, the matrix (gij)ij has rank at most 3. 
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Proof. Let us differentiate with respect to A, ^ the equation 



pi{A) = ui{A,p{A)) = pAA) - \ logd + d,{A,p{A))^{A,p(A))) 

2 dpi 



dpi _ Dm,(A,MA)) 

( He/;) h di — ) 



dAij TlAij 

Bpi 1 1 Dt/, 50 5/?* 



5Ay 2 1 +^/,|^^DA,,,5/?, 'Aj5/?,5Ay 'BpiBptBAi,^ 
But as ^(A, p(A)) = and di{A, p{A)) > 0, we get for all A, /, I: 

X-^(A,p(A))^(A).--^(A,,(A)) 
^ BpiBpk BAij BpiBAij 

Or more simply, |^ = Multiplying by using the fact that the nullspace of |^ is 

the multiples of e (PropositionfTO]) and using Q yields: 

Bp __^d^-0 
n ' dAi j Bp 



(^- -ir)i^--&u ■ (15) 



Multiplying by VA^^ gives ^e^^ = VN'if^fcij- We then use VA^' e = 1, we reinject 

in ([B]) and we multiply by to get the result V/^^ = {-Vf + (V f e)VN'^){'^_fcij. 
Finally, the equality 

pjln 2(T — 1 

'■^ 5/?,5Ai,j i;,v,jvA,v.^ve/v /V 

y -; / /i»''''"'''-y / A, ,e'''^''i' 

where Bi - — — '— — - — j7-t-> 7 shows that the matrix (e,- ,),■ , has rank at most 3 since we can 

write it as the sum of three rank one matrices. □ 

This proposition is the analog of Corollary |2j the latter being for Perron vector optimization 
problems. It both cases, the derivative has a low rank and one can compute it thanks to a Drazin 
inverse. Moreover, thanks to Proposition [To| one can apply Proposition [t] to M - I„ - jV^O and 
z' - —\z- Indeed, M has all its eigenvalues within (-1, 1], 1 is a single eigenvalue with ee^ jn 
being the associated eigenprojector So, for HOTS optimization problems as well as for Perron 
vector optimization problems, the derivative is easy to compute as soon as the second eigenvalue 
of is not too small. 

Remark that Proposition 1 1 is still true if we replace by diag(t/)V^6' and Ci j by diag((i)c, j. 



We conjecture that 1 is still the principal eigenvalue of /„ - idiag(t/)V^0 and that the spectral gap 
is larger than the one of /„ - jV^fl. Numerical experiments seem to confirm this conjecture. 
For HOTS optimization also, we have a threshold property. 

Proposition 12 (Threshold property). Let A he a stationary point for the relaxed HOTS opti- 
mization problem ^14| l with associated HOTS vector p and let w be defined as in Proposition \ll\ 

Let B — ^'^^ • Then for all facultative link (i,j), Wj > Wj + B implies that 

Ajj - 1 and wj < wi + B implies that Aij — 0. 
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Figure 3: Strict local maximum for HOTS optimization on the small web graph of Figure[T] The locally optimal solution 
for the problem of maximizing /(«) = exp(p,) presented here is to select the bold arcs with weight 1. If one replaces 
the arc from 20 to 17 by the arc from 17 to 20 one gets another strict local optimal solution but with a smaller value 
(0.166 instead of 0.169). This shows that the problem is not quasi-concave. 

Proof. A simple development of c'. j in Proposition jTTj shows that the derivative of the objective 

is given by g^j = Y^^^^^'""""'^''' ~ ^> + 

The result follows from the fact that the problem has only box constraints. Indeed, a nonzero 
value of the derivative at the maximum determines whether the upper bound is saturated j < 0) 
or the lower bound is saturated {gij > 0). If the derivative is zero, then the weight of the link can 
take any value. □ 

Remark 7. This proposition shows that w gives a total order of preference in pointing to a page 
or another. 

6.3. Example 

We take the same web site as in Figures [T|and[2]with the same admissible actions. We choose 
the objective function f{p) - Y^iei ^WiPi) the normalization N(u) = log(2,g/ exp(/:i,)) = 0. 
The initial value of the objective is 0.142 and we present a local solution with value 0.169 on 
Figure |3] 

7. Numerical results 

By performing a crawl of our laboratory website and its surrounding pages with 1,500 pages, 
we obtained a fragment of the web graph. We have selected 49 pages representing a website /. 
We set r, - 1 if / e / and r, = otherwise. The set of obligatory links were the initial links 
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Sum of HITS authority scores 
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0.05 - 








5 10 15 20 25 30 35 

Gradient iterate 

Figure 4: Optimization of the sum of HITS authority scores. The figure shows that the objective is increasing during the 
course of the algorithm. The sum of authority values jumps from 3.5e-6 to 0.19. However, despite this big progress at 
the beginning, convergence is slow. This is a typical situation with first order methods and nonconvex optimization. 





Matrix assemblings 


Power iterations 


Time 


Gradient (Equation (|4]i) 


545 




304 s 


Gradient (Remark 4 1 


324 


56,239 


67 s 


Coupled iterations (Th. 2 1 


589 


14,289 


15 s 



Table 1 : Comparison of gradient algorithm witli the evaluation of the gradient done by direct resolution of Equation j4j 
by Matlab "mrdivide" function, gradient algorithm with hot started power iterations described in Remark |4] (precision 
10"') and coupled gradient and power iterations. The goal was to reach the value 0.22 on our laboratory dataset (the best 
value we found was 0.2285). For this problem, coupling the power and gradient iterations makes a speedup of more than 
four. 



already present at time of the crawl, the facultative links are all other links from controlled pages 
except self-links. 

We launched our numerical experiments on a personal computer with Intel Xeon CPU at 
2.98 Ghz and 8 GB RAM. We wrote the code in Matlab language. We refer to MFABGIOI for 
numerical experiments for PageRank optimization. 

7.1. HITS authority optimization 

As in Section |5] we maximize the sum of HITS authority scores on the web site, that is we 
maximize /(«) - r,M^ under the normalization N(u) - (2,g[„] riu^y^^ - 1. 

We use the coupled power and grandient iterations described in Section |4] We show the 
progress of the objective on Figure |4] and we compare coupled power and grandient iterations 
with classical gradient in Table [T] 

The best strategy of links found has lots of binary values: only 4569 values different from 1 
among 11,516 nonnegative controled values. Moreover, the heuristic described in Section[5]gives 
a 0-1 matrix with a value at 0.07% from the weighted local optimum found. It consists in adding 
all possible links between controlled pages (internal-links) and some external links. Following 
Proposition |9j as the controlled pages share many facultative outlinks, we can identify growing 
sequences in the sets of activated outlinks. 

7.2. HOTS optimization 

Here, we cannot use the coupled power and gradient iterations since we do not have any 
effective bound for the distance between the actual iterate and the true HOTS vector. We thus 
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Figure 5: Strict local optimal solution of the HOTS optimization problem. We present the adjacency matrix restricted to 
the set of controlled pages. There are no facultative external links added. We sorted pages by their w value. The rounded 
black dots are the obligatoiy links and the blue dots are the activated facultative links. We can see that the higher the 
w value, the bigger the number of controlled pages pointing to that page and the smaller the number of pages that page 
points to (Proposition[T2j. It is worth noting that this local solution of the relaxed problem has only binary weights. 



solve the HOTS optimization problem with class ical gradient. We computed the derivative by 
assuming the conjecture at the end of Section 6.2 Indeed, the second eigenvalue of is small 



while the second eigenvalue of diag((f)V^0 is larger. We also never encountered a case where the 
largest eigenvalue of diag(ii)V^6' is bigger than 4. 

We consider the same website as for the numerical experiments for HITS authority. We take 
as objective function f(p) - Y^iei exp(/?,) under the normalization N{u) - log(2,£[„] exp(/?,)) - 0. 

Here, a stationnary point is reached after 55 gradient iterations and 8.2 s. The initial value of 
the objective was 0.0198 and the final value returned by the optimization algorithm was 0.0567. 
We give a local optimal adjacency matrix on Figure |5] 



7.3. Scalability of the algorithms 





HITS (CMAP) 


HOTS (CMAP) 


HITS (NZU) 


HOTS (NZU) 


Gradient (Eq. (|4 1) 


1.01 s/it 


0.82 s/it 


out of memory 


out of memory 


Gradient (Rem. 4 




0.23 s/it 


0.12 s/it 


20 s/it 


62 s/it 


Coupled iter. (Th. 


2i 


0.038 s/it 


0.04 s/it 


1.0 s/it 


2.9 s/it 



Table 2: Mean execution times by gradient iterations for HITS authority and HOTS optimization for 2 fragments of the 
web graph: our laboratory web site and sun'ounding web pages (CMAP dataset, 1,500 pages) and New Zealand Uni- 
versities websites (NZU dataset, 413,639 pages). The execution time with a direct resolution of Equation j4j by Matlab 
"mrdivide" function becomes very memory consuming when the size of the problem grows. It is still acceptable with 
matrices of size 1,500 but fails for larger matrices. The other two algorithms scale well. For, HOTS optimization prob- 
lems, as they are not Perron vector optimization problems, Theorem[2]does not apply. However, we have implemented 
the algorithm anyway. The execution time mainly depends on the spectral gap of the matrix, which is similar in the 
four cases and on the cost by matrix-vector product, which is growing only linearly thanks to the sparsity of the matrix. 
The number of gradient iterations is in general not dependent on the size of the problem: in our experiments for HITS 
authority optimization, there were even less gradient iterations with the larger dataset. When the coupled iterations is 
available, it gives a speedup between 3 and 20. 
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We launched our algorithms on two test sets and we give the execution times on Table [2] 
Our laboratory's website is described at the beginning of this section. The crawl of New Zealand 
Universities websites is available at IIPro06l . We selected the 1,696 pages containing the keyword 
"maori" in their url and 1,807 possible destination pages for the facultative hyperlinks, which 
yields 3,048,798 facultative hyperlinks. In both cases, we maximize the sum of the scores of the 
controlled pages. 

We remark that for a problem more that 300 times bigger, the computational cost does not 
increase that much. Indeed, the spectral gap is similar and the cost by matrix-vector product is 
growing only linearly thanks to the sparsity of the matrix. 
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